Presenter: Olivia Beck
Content Credit: Matthew Beckman, Hadley
Wickham
May 17, 2023
“Happy families are all alike; every unhappy family is unhappy in its own way.” –– Leo Tolstoy
“Tidy datasets are all alike, but every messy dataset is messy in its own way.” –– Hadley Wickham
Variable
Cases
There are three interrelated rules which make a dataset tidy:
It is your job as the researcher to define the variables, observations, and values.
Example of Untidy data
Example of Tidy Data
In the 1880s, Francis Galton started to make a mathematical theory of evolution.
Here’s part of a page from his lab notebook. Discuss the following in groups:
A page from Francis Galton’s notebook.
Work to put these tables in tidy form
A page from Francis Galton’s notebook.
A codebook describes the contents, structure, and layout of a data collection.
A well-documented codebook contains information intended to be complete and self-explanatory for each variable in a data file
Federal Elections Comission